Blueprint components and configuration

Blueprints provide a YAML-based framework for defining custom connectors. This structured configuration allows you to connect to REST APIs and build scalable data pipelines without the need for custom scripting.

Blueprint YAML structure

Each Blueprint is composed of the following core components:

| Component | Description | Required |
|---|---|---|
| interface_parameters | User-configurable inputs (authentication, filters, dates) | No |
| connector | API connection settings (base URL, headers, storage) | Yes |
| steps | Workflow logic (REST calls, loops, data extraction) | Yes |

Example: Complete structure configuration
# 1. Interface Parameters - User inputs displayed in River configuration
interface_parameters:
  section:
    source:
      - name: "api_credentials"
        type: "authentication"
        auth_type: "bearer"
        fields:
          - name: "bearer_token"
            type: "string"
            is_encrypted: true
      - name: "date_range"
        type: "date_range"
        period_type: "date"
        format: "YYYY-mm-DD"
        fields:
          - name: "start_date"
            value: ""
          - name: "end_date"
            value: ""

# 2. Connector Configuration - API connection settings
connector:
  name: "My API Connector"
  base_url: "https://api.example.com/v1"
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
  default_retry_strategy:
    500:
      max_attempts: 3
      retry_interval: 10
    429:
      max_attempts: 5
      retry_interval: 60
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"

# 3. Steps - Workflow logic
steps:
  - name: "Fetch Data"
    description: "Retrieve data from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/data"
    query_params:
      start_date: "{date_range.start_date}"
      end_date: "{date_range.end_date}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data"

Connector Configuration

The connector configuration establishes the foundational settings for your API integration.

Configuration fields

| Field | Description | Required |
|---|---|---|
| name | Descriptive name for the connector. | Yes |
| base_url | Root URL for all API requests. | Yes |
| default_headers | Headers sent with every request. | No |
| default_retry_strategy | Retry policies for failed requests. | No |
| variables_metadata | Variable storage configuration. | Yes |
| variables_storages | Storage location definitions. | Yes |

Base URL

The base URL is the root endpoint for your API. All step endpoints are appended to this URL.

connector:
  name: "Salesforce Connector"
  base_url: "https://mycompany.salesforce.com/services/data/v58.0"

Note: Do not include trailing slashes in the base URL.

Default headers

Headers that should be sent with every API request:

connector:
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
    X-Custom-Header: "custom-value"

Note: Do not include Authorization headers here - authentication is automatically injected from interface parameters.

Default retry strategy

Configure automatic retries for specific HTTP status codes. Each status code can have a unique maximum attempt count and interval.

  • Max attempts: The maximum number of times the connector retries a request that returned the given status code.
  • Retry interval: The time to wait between attempts, in seconds.

connector:
  default_retry_strategy:
    429: # Rate Limited
      max_attempts: 5
      retry_interval: 60 # seconds
    500: # Internal Server Error
      max_attempts: 3
      retry_interval: 10
    502: # Bad Gateway
      max_attempts: 3
      retry_interval: 10
    503: # Service Unavailable
      max_attempts: 3
      retry_interval: 30
    504: # Gateway Timeout
      max_attempts: 3
      retry_interval: 10
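
The per-status retry behavior can be illustrated with a short Python sketch. It is not the platform's implementation: the function name call_with_retries, the injected send and sleep callables, and the assumption that max_attempts counts retries made after the first failed call (tracked separately per status code) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# Mirrors the shape of default_retry_strategy: status code -> policy.
RETRY_STRATEGY: Dict[int, Dict[str, int]] = {
    429: {"max_attempts": 5, "retry_interval": 60},
    500: {"max_attempts": 3, "retry_interval": 10},
}


def call_with_retries(
    send: Callable[[], int],
    strategy: Dict[int, Dict[str, int]],
    sleep: Callable[[float], None],
) -> Tuple[int, int]:
    """Call send() until a status without a policy arrives or the budget runs out.

    Returns (final_status, total_calls). Assumption: max_attempts counts
    retries after the first failed call, tracked separately per status code.
    """
    retries_used: Dict[int, int] = {}
    calls = 0
    while True:
        status = send()
        calls += 1
        policy = strategy.get(status)
        if policy is None:
            return status, calls  # success, or a status with no retry policy
        used = retries_used.get(status, 0)
        if used >= policy["max_attempts"]:
            return status, calls  # retry budget for this status is exhausted
        retries_used[status] = used + 1
        sleep(policy["retry_interval"])  # wait before the next attempt


# Simulated endpoint: two server errors, then success.
responses = iter([500, 500, 200])
waits = []  # records the waits instead of really sleeping
result = call_with_retries(lambda: next(responses), RETRY_STRATEGY, waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the sketch fast to run; in a real client it would be time.sleep.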

Variables storage

The variables storage configuration defines how and where the connector stores extracted data during execution.

connector:
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
    intermediate_data:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"

Workflow steps

Steps define the execution logic of your connector. Steps execute sequentially. You can use data extracted from one step in subsequent steps.

Step types

The following table describes the available step types.

| Type | Description |
|---|---|
| rest | Executes an HTTP request (GET, POST, PUT, PATCH, DELETE). |
| loop | Iterates over a data collection, executing nested steps for each element. |

REST step

A REST step executes a single HTTP request.

steps:
  - name: "Get Users"
    description: "Fetch all users from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users"
    query_params:
      status: "active"
      limit: "100"
    headers:
      X-Request-ID: "unique-id"
    retry_strategy:
      500:
        max_attempts: 3
        retry_interval: 10
    variables_output:
      - response_location: "data"
        variable_name: "users_list"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data.users"

HTTP methods

The http_method determines the action the request performs on the resource.

| Method | Description |
|---|---|
| GET | Retrieves data. |
| POST | Creates data or sends payloads. |
| PUT | Updates or replaces data. |
| PATCH | Applies partial updates to data. |
| DELETE | Removes data. |

POST request with body

Use the body field to define the data payload when creating or updating a record.

steps:
  - name: "Create Record"
    description: "Create a new record via POST"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/records"
    headers:
      Content-Type: "application/json"
    body:
      name: "{{%record_name%}}"
      email: "{{%record_email%}}"
      status: "active"
    variables_output:
      - response_location: "data"
        variable_name: "created_record"
        variable_format: "json"

Loop step

Loop steps iterate over collections of data and execute nested steps for each item.

steps:
  # Step 1: Fetch list of account IDs
  - name: "Get Account IDs"
    description: "Retrieve all account IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/accounts"
    variables_output:
      - response_location: "data"
        variable_name: "account_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.accounts[*].id" # Extract just the IDs

  # Step 2: Loop through each account ID
  - name: "Process Each Account"
    description: "Fetch details for each account"
    type: "loop"
    loop:
      type: "data"
      variable_name: "account_ids"
      item_name: "account_id" # Each item IS the ID
      add_to_results: true
      ignore_errors: false
    steps:
      - name: "Get Account Details"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/accounts/{{%account_id%}}/details"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Important: The item_name represents the entire current item in the iteration. To use a specific property (for example, an ID), extract it in the transformation layer so that each loop item contains only the required value.

Loop configuration options

The following table describes the fields for configuring a loop.

| Field | Description | Required |
|---|---|---|
| type | Specifies the loop type: data, date_range, or while. | Yes |
| variable_name | Identifies the variable containing the array to iterate. | Yes |
| item_name | Sets an alias for the current item in the iteration. | Yes |
| add_to_results | Includes the loop output in the final results. | Yes |
| ignore_errors | Continues the loop if individual items fail. | No |
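
To make the interaction between add_to_results and ignore_errors concrete, here is a minimal Python sketch of a data loop. It only models the behavior described in the table; run_data_loop, run_nested_steps, and fetch_details are hypothetical names, and the real engine may differ in detail.

```python
from typing import Any, Callable, Iterable, List


def run_data_loop(
    items: Iterable[Any],
    run_nested_steps: Callable[[Any], Any],
    add_to_results: bool = True,
    ignore_errors: bool = False,
) -> List[Any]:
    """Run the nested steps once per item, in order."""
    results: List[Any] = []
    for item in items:  # 'item' plays the role of item_name
        try:
            output = run_nested_steps(item)
        except Exception:
            if ignore_errors:
                continue  # skip the failed item and keep iterating
            raise  # otherwise the first failure stops the loop
        if add_to_results:
            results.append(output)
    return results


def fetch_details(account_id: str) -> dict:
    """Stand-in for a nested REST step; fails for one specific ID."""
    if account_id == "a2":
        raise RuntimeError("simulated API failure")
    return {"id": account_id}


collected = run_data_loop(["a1", "a2", "a3"], fetch_details, ignore_errors=True)
print(collected)  # the failed item "a2" is skipped
```

With ignore_errors left at false, the same input would raise on "a2" and the loop would stop there.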

External variables loop

When a loop is the first step in your workflow, you can iterate over data passed from the source River using the {ext.} syntax:

steps:
  - name: "Process External IDs"
    description: "Loop through IDs from source River"
    type: "loop"
    loop:
      type: "data"
      variable_name: "{ext.source_ids}" # External variable syntax
      item_name: "item_id"
      add_to_results: true
      ignore_errors: true
    steps:
      - name: "Fetch Item"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/items/{{%item_id%}}"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Note: The {ext.} prefix can only be used in the first step of a workflow.

Accessing external dictionary properties

When an external variable is a dictionary (object), you can access its properties using dot notation.

steps:
  - name: "Fetch Using External Config"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users/{{%{ext.config.user_id}%}}"
    query_params:
      region: "{{%{ext.config.region}%}}"

Note: Dot notation for property access ({{%variable.property%}}) only works with external dictionary variables. For standard loop items, extract the specific values in the transformation layer.

Variable outputs and transformations

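Define the variables_output list to specify how the connector handles response data. Each entry names a variable, the part of the response it reads from, and any transformations to apply.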

Variable output configuration

variables_output:
  - response_location: "data" # data, header, or status_code
    variable_name: "users_data"
    variable_format: "json"
    overwrite_storage: false # Append (false) or replace (true)
    transformation_layers:
      - type: "extract_json"
        from_type: "json"
        json_path: "$.data.users[*]"

Response locations

The following table describes the available source locations for variables.

| Location | Description |
|---|---|
| data | Extracts content from the response body. |
| header | Extracts values from the response headers. |
| status_code | Captures the numerical HTTP status code. |
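
The sketch below models, in plain Python, how response_location selects a part of a response and how overwrite_storage switches between appending and replacing. The response dictionary layout and the helper names select_from_response and store are assumptions made for illustration, not part of the Blueprint engine.

```python
from typing import Any, Dict, List


def select_from_response(response: Dict[str, Any], response_location: str) -> Any:
    """Pick the part of a response that a variables_output entry reads."""
    if response_location == "data":
        return response["body"]
    if response_location == "header":
        return response["headers"]
    if response_location == "status_code":
        return response["status_code"]
    raise ValueError(f"unknown response_location: {response_location}")


def store(storage: Dict[str, List[Any]], name: str, value: Any,
          overwrite_storage: bool) -> None:
    """Replace stored values when overwrite_storage is true, else append."""
    if overwrite_storage:
        storage[name] = [value]
    else:
        storage.setdefault(name, []).append(value)


# Hypothetical response shape used only for this example.
response = {"status_code": 200, "headers": {"X-Total": "2"}, "body": {"data": [1, 2]}}
storage: Dict[str, List[Any]] = {}

for _ in range(2):  # two iterations, as inside a loop step
    store(storage, "pages", select_from_response(response, "data"), overwrite_storage=False)
    store(storage, "status", select_from_response(response, "status_code"), overwrite_storage=True)

print(len(storage["pages"]), storage["status"])
```

After two iterations the appended variable holds two entries, while the overwritten one holds only the latest value. This is why the loop examples on this page set overwrite_storage: false on final_output_file.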

Transformation layers

Apply transformation layers to modify or filter response data before storing it.

transformation_layers:
  - type: "extract_json"
    from_type: "json"
    json_path: "$.data.items[*]"

Supported transformations

The following table lists the available transformation types.

| Type | Description |
|---|---|
| extract_json | Extracts specific data using JSONPath syntax. |
| extract_csv | Parses incoming CSV data into a usable format. |
| to_json | Converts the data into JSON format. |
| to_csv | Converts the data into CSV format. |

Common JSONPath patterns

# Direct property
json_path: "$.data"

# Nested property
json_path: "$.data.users"

# All array items
json_path: "$.data.users[*]"

# Specific field from all items
json_path: "$.data.users[*].id"

# Root array
json_path: "$[*]"

# Last item in array
json_path: "$.data[-1].id"
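
For readers less familiar with JSONPath, the following Python snippet shows what each pattern above would select from small sample payloads, written as ordinary dictionary and list access. The sample data is invented for illustration, and no JSONPath library is used; the comments map each expression to the pattern it mirrors.

```python
# Sample payloads invented for this example.
response = {
    "data": {
        "users": [
            {"id": 1, "name": "Ada"},
            {"id": 2, "name": "Grace"},
        ]
    }
}
root_array = [{"id": 10}, {"id": 20}]
paged = {"data": [{"id": 10}, {"id": 20}]}

direct = response["data"]                           # $.data
nested = response["data"]["users"]                  # $.data.users
all_items = list(response["data"]["users"])         # $.data.users[*]
ids = [u["id"] for u in response["data"]["users"]]  # $.data.users[*].id
root_items = list(root_array)                       # $[*]
last_id = paged["data"][-1]["id"]                   # $.data[-1].id

print(ids, last_id)
```

Support for negative indexes such as [-1] varies between JSONPath implementations, so it is worth testing that pattern against real responses.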

Variable reference syntax

Use the following syntax patterns to inject dynamic data into your configuration.

| Context | Syntax | Description | Example |
|---|---|---|---|
| Internal variable | {{%variable_name%}} | References data from a previous step. | /users/{{%user_id%}} |
| Loop item | {{%item_name%}} | References the current loop item value. | /orders/{{%order_id%}} |
| Interface parameter | {param_name} | References a user input value. | ?status={status_filter} |
| Date range start | {param.start_date} | References a start date from a picker. | from={dates.start_date} |
| Date range end | {param.end_date} | References an end date from a picker. | to={dates.end_date} |
| External data | {ext.variable_name} | References data from a source River. | {ext.incoming_ids} |
| External dict property | {{%{ext.dict.property}%}} | References a property from a dictionary. | {{%{ext.config.user_id}%}} |
| Base URL | {{%BASE_URL%}} | References the connector base URL. | {{%BASE_URL%}}/users |
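
As a rough mental model of the two main placeholder styles, the Python sketch below resolves {{%name%}} references from a dictionary of step variables and {name} or {name.field} references from a dictionary of interface parameters. It assumes plain text substitution, does not handle the {ext.} forms, and uses hypothetical names (render, lookup); the actual resolution rules of the engine may differ.

```python
import re
from typing import Any, Dict

# {{%name%}} : internal variables, loop items, BASE_URL
INTERNAL = re.compile(r"\{\{%([A-Za-z_][A-Za-z0-9_]*)%\}\}")
# {name} or {name.field} : interface parameters and date range fields
PARAM = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\}")


def lookup(path: str, scope: Dict[str, Any]) -> Any:
    """Follow a dotted path such as 'dates.start_date' through nested dicts."""
    value: Any = scope
    for key in path.split("."):
        value = value[key]
    return value


def render(template: str, variables: Dict[str, Any], params: Dict[str, Any]) -> str:
    # Resolve {{%name%}} first so its braces are gone before {name} matching.
    text = INTERNAL.sub(lambda m: str(variables[m.group(1)]), template)
    return PARAM.sub(lambda m: str(lookup(m.group(1), params)), text)


variables = {"BASE_URL": "https://api.example.com/v1", "user_id": 42}
params = {"dates": {"start_date": "2024-01-01"}, "status_filter": "active"}

url = render(
    "{{%BASE_URL%}}/users/{{%user_id%}}?from={dates.start_date}&status={status_filter}",
    variables,
    params,
)
print(url)
```

Under this plain-substitution assumption, a base URL ending in "/" would yield a double slash before "users", which is consistent with the advice to omit trailing slashes from base_url.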

Workflow patterns

The following patterns show common implementation strategies for connector workflows.

Pattern 1: Simple data fetch

Single REST step to fetch data:

steps:
  - name: "Fetch All Records"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/records"
    pagination:
      type: "page"
      location: "qs"
      parameters:
        - name: "page"
          value: 1
          increment_by: 1
        - name: "per_page"
          value: 100
      break_conditions:
        - name: "No More Data"
          condition:
            type: "empty_json_path"
            key_json_path: "$.data"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
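
The page-based pagination in this pattern can be pictured as the loop below. This is a simplified Python model, not the engine's code: fetch_all_pages, get_page, fake_api, and the max_pages cap are hypothetical, and the break condition is reduced to "stop when nothing is found at $.data".

```python
from typing import Any, Callable, Dict, List


def fetch_all_pages(
    get_page: Callable[[Dict[str, int]], Dict[str, Any]],
    start_page: int = 1,
    increment_by: int = 1,
    per_page: int = 100,
    max_pages: int = 1000,
) -> List[Any]:
    """Request pages until $.data is empty (the empty_json_path break)."""
    records: List[Any] = []
    page = start_page
    for _ in range(max_pages):  # hard cap: an extra guard, not part of the YAML
        body = get_page({"page": page, "per_page": per_page})
        items = body.get("data") or []
        if not items:  # break condition: nothing at $.data
            break
        records.extend(items)
        page += increment_by
    return records


DATASET = list(range(250))  # pretend the API holds 250 records
calls: List[int] = []       # records which pages were requested


def fake_api(params: Dict[str, int]) -> Dict[str, Any]:
    """Stand-in for the REST call; slices DATASET by page and per_page."""
    calls.append(params["page"])
    start = (params["page"] - 1) * params["per_page"]
    return {"data": DATASET[start:start + params["per_page"]]}


all_records = fetch_all_pages(fake_api)
print(len(all_records), calls)
```

In this run the fourth request returns an empty list, which triggers the break. Without a break condition, an API that keeps returning the last page would never stop the loop.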

Pattern 2: Parent-child relationship

Fetch a list, then get details for each item:

steps:
  # Step 1: Get parent records
  - name: "Get Organization IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/organizations"
    variables_output:
      - response_location: "data"
        variable_name: "org_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.organizations[*].id" # Extract just the IDs

  # Step 2: Loop through each and get child records
  - name: "Get Org Members"
    type: "loop"
    loop:
      type: "data"
      variable_name: "org_ids"
      item_name: "org_id" # Each item IS the org ID
      add_to_results: true
    steps:
      - name: "Fetch Members"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/organizations/{{%org_id%}}/members"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false

Pattern 3: Sequential API calls

Multiple REST steps run in sequence, with a later step using a value extracted by an earlier one:

steps:
  # Step 1: Authenticate and get token
  - name: "Get Access Token"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/auth/token"
    body:
      grant_type: "client_credentials"
    variables_output:
      - response_location: "data"
        variable_name: "auth_token"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.access_token"

  # Step 2: Use token to fetch data
  - name: "Fetch Protected Data"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/protected/data"
    headers:
      Authorization: "Bearer {{%auth_token%}}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"

Best practices

  • Naming conventions: Use clear and descriptive names.

    Example
    # Good
    variable_name: "user_profiles"
    variable_name: "order_transactions"
    variable_name: "final_output_file"

    # Avoid
    variable_name: "data"
    variable_name: "temp"
    variable_name: "x"
  • Error handling: Configure retry strategies for common error codes.

    Example
    retry_strategy:
      429:
        max_attempts: 5
        retry_interval: 60
      500:
        max_attempts: 3
        retry_interval: 10
  • Pagination safety: Include break conditions in all paginated requests to prevent infinite loops.

    Example
    break_conditions:
      - name: "Primary: Empty Data"
        condition:
          type: "empty_json_path"
          key_json_path: "$.data"
      - name: "Safety: Page Size Check"
        condition:
          type: "page_size_break"
          page_size_param_name: "limit"
          items_json_path: "$.data"
  • Loop configuration:

    • Set ignore_errors: true if the workflow must continue despite individual item failures.
    • Set add_to_results: true to include loop outputs in the final result.
    • Test workflows with small data sets before executing full production runs.
  • Security:

    • Mark all sensitive fields with is_encrypted: true.
    • Never hardcode credentials in YAML.
    • Use interface parameters for all authentication.

Advanced features

The following advanced features are available for complex scenarios. For more information, refer to the YAML reference guide.

| Feature | Description |
|---|---|
| PUT/PATCH/DELETE methods | Additional HTTP methods for data modification. |
| Date range loops | Iterates through date chunks. |
| While loops | Repeats an execution block until a specific condition is met. |
| Pre-run configuration | Executes setup steps before the main workflow starts. |
| Multi-report configuration | Generates multiple report outputs from one workflow. |
| Advanced break conditions | Evaluates string equality, numeric values, or compound OR logic. |
| CSV transformations | Parses and converts data between JSON and CSV formats. |
| RFC8288 Link header pagination | Uses standard Link headers to manage paginated API responses. |